15 November 2022
# A tsibble: 168 x 2 [1M]
Month Turnover
<mth> <dbl>
1 2006 Jan 1914.
2 2006 Feb 1750
3 2006 Mar 1984.
4 2006 Apr 1967.
5 2006 May 2005.
6 2006 Jun 1944.
7 2006 Jul 2019.
8 2006 Aug 2043.
9 2006 Sep 2039.
10 2006 Oct 2113.
# … with 158 more rows
auscafe |>
filter(year(Month) <= 2017) |>
model(
ets = ETS(Turnover),
arima = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
snaive = SNAIVE(Turnover)
) |>
mutate(ensemble = (ets + arima + snaive)/3)# A mable: 1 x 4
ets arima snaive ensemble
<model> <model> <model> <model>
1 <ETS(M,A,M)> <ARIMA(0,1,1)(0,1,1)[12]> <SNAIVE> <COMBINATION>
auscafe |>
filter(year(Month) <= 2017) |>
model(
ets = ETS(Turnover),
arima = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
snaive = SNAIVE(Turnover)
) |>
mutate(ensemble = (ets + arima + snaive)/3) |>
forecast(h = "2 years")# A fable: 96 x 4 [1M]
# Key: .model [4]
.model Month Turnover .mean
<chr> <mth> <dist> <dbl>
1 ets 2018 Jan N(3753, 4335) 3753.
2 ets 2018 Feb N(3434, 4937) 3434.
3 ets 2018 Mar N(3801, 7735) 3801.
4 ets 2018 Apr N(3736, 9189) 3736.
5 ets 2018 May N(3782, 11269) 3782.
6 ets 2018 Jun N(3666, 12416) 3666.
7 ets 2018 Jul N(3876, 16022) 3876.
8 ets 2018 Aug N(3923, 18713) 3923.
9 ets 2018 Sep N(3878, 20630) 3878.
10 ets 2018 Oct N(4010, 24681) 4010.
# … with 86 more rows
auscafe |>
filter(year(Month) <= 2017) |>
model(
ets = ETS(Turnover),
arima = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
snaive = SNAIVE(Turnover)
) |>
mutate(ensemble = (ets + arima + snaive)/3) |>
forecast(h = "2 years") |>
accuracy(
data = auscafe,
measures = list(rmse=RMSE)
) |>
arrange(rmse)# A tibble: 4 × 3
.model .type rmse
<chr> <chr> <dbl>
1 ensemble Test 28.7
2 ets Test 88.2
3 arima Test 111.
4 snaive Test 174.
According to the democratic principle of “one vote one value”, the middlemost estimate expresses the vox populi.
“The combined forecasting methods seem to me to be non-starters . . . Is a combined method not in danger of falling between two stools?” Maurice Priestley
The first genuine forecasting competition
The first genuine forecasting competition
Conclusions
The first genuine forecasting competition
Consequences
Researchers started to:
Winning methods
Suppose we have N different forecasting methods.
\hat{\bm{y}}_{T+h|T} = N-vector of h-step-ahead forecasts using information up to time T.
\bm{\Sigma}_{T+h|T}= N\times N covariance matrix of the h-step forecast errors.
\bm{1}= unit vector.
Simple combination: \tilde{y}_{T+h|T} = N^{-1}\bm{1}'\hat{\bm{y}}_{T+h|T}
Linear combination: \tilde{y}_{T+h|T} = \bm{w}'_{T+h|T}\hat{\bm{y}}_{T+h|T}
Optimal combination (minimizing MSE):
\bm{w}_{T+h|T} = \frac{\bm{\Sigma}_{T+h|T}^{-1}\bm{1}}{\bm{1}'\bm{\Sigma}_{T+h|T}^{-1}\bm{1}'} (Bates & Granger, 1969; Granger & Newbold, 1974)
Regress y_{t+h} against \hat{\bm{y}}_{t+h|t} for t=1,\dots,T.
Simple average combinations almost always give more accurate forecasts than other combination methods.
Feature-based FORecast Model Averaging
Ensemble forecasting: mix the forecast distributions from multiple models.
Ensemble forecasting: mix the forecast distributions from multiple models.
Let F_{i,T+h|T}(y) be the h-step forecast distribution for the ith method. Then \begin{align*} \tilde{F}_{T+h|T}(y) &= \sum_{i=1}^N w_iF_{i,T+h|T}(y) \\ \tilde{\mu} &= \sum_{i=1}^N w_i\mu_i \\ \tilde{\sigma}^2 &= \sum_{i=1}^N w_i\sigma_i^2 + \sum_{i=1}^N w_i(\mu_i - \tilde{\mu})^2 \end{align*}
More diverse forecasts may inflate the variance, but mitigates against over-confidence.
\begin{align*} f_{p,t} &= \text{quantile forecast with prob. $p$ at time $t$.}\\ y_{t} &= \text{observation at time $t$} \end{align*}
Q_{p,t} = \begin{cases} 2(1 - p) \big|y_t - f_{p,t}\big|, & \text{if $y_{t} < f_{p,t}$}\\ 2p \big|y_{t} - f_{p,t}\big|, & \text{if $y_{t} \ge f_{p,t}$} \end{cases}
# A tibble: 4 × 4
.model .type rmse crps
<chr> <chr> <dbl> <dbl>
1 ensemble Test 28.7 52.5
2 ets Test 88.2 57.4
3 arima Test 111. 64.8
4 snaive Test 174. 99.7
# A tsibble: 1,344 x 3 [1M]
# Key: State [8]
Month State Turnover
<mth> <chr> <dbl>
1 2006 Jan ACT 21.7
2 2006 Feb ACT 21.9
3 2006 Mar ACT 24.9
4 2006 Apr ACT 24.8
5 2006 May ACT 27
6 2006 Jun ACT 24.5
7 2006 Jul ACT 24
8 2006 Aug ACT 26.1
9 2006 Sep ACT 26.2
10 2006 Oct ACT 33.7
# … with 1,334 more rows
# A tsibble: 1,152 x 3 [1M]
# Key: State [8]
Month State Turnover
<mth> <chr> <dbl>
1 2006 Jan ACT 21.7
2 2006 Feb ACT 21.9
3 2006 Mar ACT 24.9
4 2006 Apr ACT 24.8
5 2006 May ACT 27
6 2006 Jun ACT 24.5
7 2006 Jul ACT 24
8 2006 Aug ACT 26.1
9 2006 Sep ACT 26.2
10 2006 Oct ACT 33.7
# … with 1,142 more rows
cafe |>
filter(year(Month) <= 2017) |>
model(
ETS = ETS(Turnover),
ARIMA = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
SNAIVE = SNAIVE(Turnover)
) |>
mutate(
COMB = (ETS+ARIMA)/2
)# A mable: 8 x 5
# Key: State [8]
State ETS ARIMA SNAIVE COMB
<chr> <model> <model> <model> <model>
1 ACT <ETS(M,Ad,M)> <ARIMA(2,1,2)(0,1,1)[12]> <SNAIVE> <COMBINATION>
2 NSW <ETS(M,Ad,M)> <ARIMA(1,1,0)(0,1,2)[12]> <SNAIVE> <COMBINATION>
3 NT <ETS(A,N,A)> <ARIMA(0,1,1)(1,1,0)[12]> <SNAIVE> <COMBINATION>
4 QLD <ETS(M,Ad,M)> <ARIMA(0,1,1)(1,1,1)[12]> <SNAIVE> <COMBINATION>
5 SA <ETS(M,Ad,M)> <ARIMA(0,1,1)(0,1,1)[12]> <SNAIVE> <COMBINATION>
6 TAS <ETS(A,A,A)> <ARIMA(0,1,1)(0,1,1)[12]> <SNAIVE> <COMBINATION>
7 VIC <ETS(M,Ad,M)> <ARIMA(0,1,2)(0,1,2)[12]> <SNAIVE> <COMBINATION>
8 WA <ETS(M,Ad,M)> <ARIMA(0,1,1)(0,1,2)[12]> <SNAIVE> <COMBINATION>
cafe |>
filter(year(Month) <= 2017) |>
model(
ETS = ETS(Turnover),
ARIMA = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
SNAIVE = SNAIVE(Turnover)
) |>
mutate(
COMB = (ETS+ARIMA)/2
) |>
forecast(h = "2 years")# A fable: 768 x 5 [1M]
# Key: State, .model [32]
State .model Month Turnover .mean
<chr> <chr> <mth> <dist> <dbl>
1 ACT ETS 2018 Jan N(56, 8.6) 56.0
2 ACT ETS 2018 Feb N(59, 14) 58.8
3 ACT ETS 2018 Mar N(66, 23) 65.7
4 ACT ETS 2018 Apr N(61, 25) 61.1
5 ACT ETS 2018 May N(64, 32) 63.9
6 ACT ETS 2018 Jun N(61, 34) 61.4
7 ACT ETS 2018 Jul N(62, 40) 62.3
8 ACT ETS 2018 Aug N(64, 47) 63.5
9 ACT ETS 2018 Sep N(62, 50) 62.4
10 ACT ETS 2018 Oct N(64, 59) 64.2
# … with 758 more rows
cafe |>
filter(year(Month) <= 2017) |>
model(
ETS = ETS(Turnover),
ARIMA = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
SNAIVE = SNAIVE(Turnover)
) |>
mutate(
COMB = (ETS+ARIMA)/2
) |>
forecast(h = "2 years") |>
accuracy(data = cafe,
measures = list(crps=CRPS, rmse=RMSE)
)# A tibble: 32 × 5
.model State .type crps rmse
<chr> <chr> <chr> <dbl> <dbl>
1 ARIMA ACT Test 5.13 8.90
2 ARIMA NSW Test 24.8 36.7
3 ARIMA NT Test 1.94 2.51
4 ARIMA QLD Test 16.9 22.4
5 ARIMA SA Test 15.3 23.5
6 ARIMA TAS Test 1.63 2.23
7 ARIMA VIC Test 17.8 28.3
8 ARIMA WA Test 24.3 39.1
9 COMB ACT Test 4.80 8.40
10 COMB NSW Test 24.5 19.9
# … with 22 more rows
cafe |>
filter(year(Month) <= 2017) |>
model(
ETS = ETS(Turnover),
ARIMA = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
SNAIVE = SNAIVE(Turnover)
) |>
mutate(
COMB = (ETS+ARIMA)/2
) |>
forecast(h = "2 years") |>
accuracy(data = cafe,
measures = list(ss=skill_score(CRPS))
)# A tibble: 32 × 4
.model State .type ss
<chr> <chr> <chr> <dbl>
1 ARIMA ACT Test -0.592
2 ARIMA NSW Test 0.485
3 ARIMA NT Test -0.0751
4 ARIMA QLD Test 0.0610
5 ARIMA SA Test -0.491
6 ARIMA TAS Test 0.292
7 ARIMA VIC Test 0.484
8 ARIMA WA Test -1.11
9 COMB ACT Test -0.489
10 COMB NSW Test 0.490
# … with 22 more rows
cafe |>
filter(year(Month) <= 2017) |>
model(
ETS = ETS(Turnover),
ARIMA = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
SNAIVE = SNAIVE(Turnover)
) |>
mutate(
COMB = (ETS+ARIMA)/2
) |>
forecast(h = "2 years") |>
accuracy(data = cafe,
measures = list(ss=skill_score(CRPS))
) |>
group_by(.model) |>
summarise(sspc = mean(ss) * 100)# A tibble: 4 × 2
.model sspc
<chr> <dbl>
1 ARIMA -11.8
2 COMB 7.10
3 ETS 12.7
4 SNAIVE 0
Skill score is relative to seasonal naive forecasts